Elastic Net Constrained Stereotype Logit Model for Ordered Categorical Data

Author

  • Kellie J Archer
Abstract

In biomedical studies, the outcome of interest may be an ordinal rather than a dichotomous class label, such as progression of disease. An example of an ordinal variable is drug toxicity level evaluated as mild, moderate, or severe. Another example is the Breast Imaging Reporting and Data System (BIRADS) [1] classification system. After a mammogram is read, a subjective score is assigned based on the condition of the breast tissue. The categories are: Category 0 Incomplete; Category 1 Negative; Category 2 Benign; Category 3 Probably Benign; Category 4 Suspicious Abnormality; Category 5 Highly Suspicious of Malignancy; and Category 6 Known Biopsy Proven Malignancy. The ordinality of these categories is evident. As another example, when cancer treatments are applied there is usually an interest in how patients respond. A typical way to measure this response is the Revised Response Evaluation Criteria in Solid Tumors (RECIST) [2]. Based on a wide variety of tools, as well as defined rules for classification, Revised RECIST defines the responses as: Complete Response, Partial Response, Stable Disease, and Progressive Disease. Models used for ordinal data include the multinomial, adjacent category logit, continuation ratio logit, proportional odds logit, stereotype logit, and cumulative link models [3]. These models assume, among other things, that there are considerably more observations than variables. However, there are many types of data for which there are more variables than observations. With microarray-based or other high-throughput technologies, the expense of obtaining samples means there may be few observations but thousands of variables. In that setting the aforementioned models cannot be estimated by traditional means unless additional assumptions are made.
Although there are dimensionality reduction techniques, such as principal component analysis, the severely unbalanced nature of the data may still make it impossible to reduce the number of variables below the number of observational units without a significant loss of information. This paper is concerned with the development of an ordinal classification model that uses the Least Absolute Shrinkage and Selection Operator (LASSO) and ridge penalizations to accommodate the case where there are considerably more variables than observations. The described procedure applies this penalty to the stereotype logit model [4] in an attempt to overcome these problems. The proposed method is applied to simulated data. An algorithm is presented in which the penalized likelihood is used to model high-dimensional data with an ordinal outcome; the algorithm is applied to an actual data set with promising results.
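
The abstract does not give the model equations, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a common parameterization of the stereotype logit model, in which the score for class j is alpha_j + phi_j * (beta . x) with ordered scale parameters phi, and a glmnet-style elastic net penalty that mixes the LASSO (L1) and ridge (L2) terms. The function names and the parameters `lam` and `mix` are hypothetical, chosen for illustration.

```python
import math

def stereotype_probs(x, alphas, phis, beta):
    """Class probabilities under an assumed stereotype logit parameterization.

    Score for class j is alphas[j] + phis[j] * (beta . x); probabilities are
    the softmax of the scores. Ordinality is usually encoded by constraining
    phis to be monotone (e.g. 1 = phi_1 >= ... >= phi_J = 0); that constraint
    is not enforced here.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    scores = [a + p * eta for a, p in zip(alphas, phis)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def elastic_net_penalty(beta, lam, mix):
    """Assumed penalty: lam * (mix * L1 + (1 - mix) / 2 * squared L2).

    mix = 1 gives the LASSO penalty, mix = 0 gives the ridge penalty.
    """
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return lam * (mix * l1 + 0.5 * (1.0 - mix) * l2)

def penalized_neg_log_lik(X, y, alphas, phis, beta, lam, mix):
    """Negative log-likelihood plus the elastic net penalty on beta.

    X is a list of feature vectors; y holds 0-based class indices.
    """
    nll = 0.0
    for x, label in zip(X, y):
        nll -= math.log(stereotype_probs(x, alphas, phis, beta)[label])
    return nll + elastic_net_penalty(beta, lam, mix)
```

Minimizing `penalized_neg_log_lik` over `alphas`, `phis`, and `beta` would be the fitting step. Because the L1 term can shrink coefficients exactly to zero, a penalty of this kind remains usable when there are far more variables than observations.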

Similar resources

Working Paper Series: Categorical Data

Categorical outcome (or discrete outcome or qualitative response) regression models are models for a discrete dependent variable recording in which of two or more categories an outcome of interest lies. For binary data (two categories) probit and logit models or semiparametric methods are used. For multinomial data (more than two categories) that are unordered, common models are multinomial and...

Missing exposure data in stereotype regression model: application to matched case-control study with disease subclassification.

With advances in modern medicine and clinical diagnosis, case-control data with characterization of finer subtypes of cases are often available. In matched case-control studies, missingness in exposure values often leads to deletion of entire stratum, and thus entails a significant loss in information. When subtypes of cases are treated as categorical outcomes, the data are further stratified a...

Fitting Stereotype Logistic Regression Models for Ordinal Response Variables in Educational Research (Stata)

The stereotype logistic (SL) model is an alternative to the proportional odds (PO) model for ordinal response variables when the proportional odds assumption is violated. This model seems to be underutilized. One major reason is the constraint of current statistical software packages. Statistical Package for the Social Sciences (SPSS) cannot perform the SL regression analysis, and SAS does not ...

Comparison of methods in the analysis of dependent ordered catagorical data

Rating scales for outcome variables produce categorical data which are often ordered and measurements from rating scales are not standardized. The purpose of this study is to apply commonly used and novel methods for paired ordered categorical data to two data sets with different properties and to compare the results and the conditions for use of these models. The two applications consist of a ...

Stereotype Ordinal Regression

There are a number of reasonable approaches to analysing an ordinal outcome variable. One common approach, known as the Proportional Odds (PO) Model, is implemented in Stata as ologit. If the assumptions of the PO model are not satisfied, an alternative is to treat the outcome as categorical, rather than ordinal, and use multinomial logistic regression (mlogit) in Stata. This insert describes a...

Publication date: 2017